96 research outputs found

    3-D Motion Estimation and Wireframe Adaptation Including Photometric Effects for Model-Based Coding of Facial Image Sequences

    Cataloged from PDF version of article. We propose a novel formulation where 3-D global and local motion estimation and the adaptation of a generic wireframe model to a particular speaker are considered simultaneously within an optical flow based framework including the photometric effects of the motion. We use a flexible wireframe model whose local structure is characterized by the normal vectors of the patches which are related to the coordinates of the nodes. Geometrical constraints that describe the propagation of the movement of the nodes are introduced, which are then efficiently utilized to reduce the number of independent structure parameters. A stochastic relaxation algorithm has been used to determine optimum global motion estimates and the parameters describing the structure of the wireframe model. Results with both simulated and real facial image sequences are provided.

    DCT Coding of nonrectangularly sampled images

    Cataloged from PDF version of article. Discrete cosine transform (DCT) coding is widely used for compression of rectangularly sampled images. In this letter, we address efficient DCT coding of nonrectangularly sampled images. To this effect, we discuss an efficient method for the computation of the DCT on nonrectangular sampling grids using the Smith-normal decomposition. Simulation results are provided.

    An improvement to MBASIC algorithm for 3D motion and depth estimation

    Cataloged from PDF version of article. In model-based coding of facial images, the accuracy of motion and depth parameter estimates strongly affects the coding efficiency. MBASIC is a simple and effective iterative algorithm (recently proposed by Aizawa et al.) for 3-D motion and depth estimation when the initial depth estimates are relatively accurate. In this correspondence, we analyze its performance in the presence of errors in the initial depth estimates and propose a modification to the MBASIC algorithm that significantly improves its robustness to random errors with only a small increase in the computational load.

    Effect of Architectures and Training Methods on the Performance of Learned Video Frame Prediction

    We analyze the performance of feedforward vs. recurrent neural network (RNN) architectures and associated training methods for learned frame prediction. To this effect, we trained a residual fully convolutional neural network (FCNN), a convolutional RNN (CRNN), and a convolutional long short-term memory (CLSTM) network for next frame prediction using the mean square loss. We performed both stateless and stateful training for recurrent networks. Experimental results show that the residual FCNN architecture performs the best in terms of peak signal to noise ratio (PSNR) at the expense of higher training and test (inference) computational complexity. The CRNN can be trained stably and very efficiently using the stateful truncated backpropagation through time procedure, and it requires an order of magnitude less inference runtime to achieve near real-time frame prediction with an acceptable performance. Comment: Accepted for publication at IEEE ICIP 201

    Multi-Scale Deformable Alignment and Content-Adaptive Inference for Flexible-Rate Bi-Directional Video Compression

    The inability to adapt the motion compensation model to video content is an important limitation of current end-to-end learned video compression models. This paper advances the state-of-the-art by proposing an adaptive motion-compensation model for end-to-end rate-distortion optimized hierarchical bi-directional video compression. In particular, we propose two novelties: i) a multi-scale deformable alignment scheme at the feature level combined with multi-scale conditional coding, ii) motion-content adaptive inference. In addition, we employ a gain unit, which enables a single model to operate at multiple rate-distortion operating points. We also exploit the gain unit to control bit allocation among intra-coded vs. bi-directionally coded frames by fine tuning corresponding models for truly flexible-rate learned video coding. Experimental results demonstrate state-of-the-art rate-distortion performance exceeding that of all prior art in learned video coding. Comment: Accepted for publication in IEEE International Conference on Image Processing (ICIP) 202

    3D human action recognition in multiple view scenarios

    This paper presents a novel view-independent approach to the recognition of human gestures of several people in low resolution sequences from multiple calibrated cameras. In contrast to other multi-ocular gesture recognition systems, which classify on a fusion of features coming from different views, our system first performs a data fusion (3D representation of the scene) and then a feature extraction and classification. Motion descriptors introduced by Bobick et al. for 2D data are extended to 3D, and a set of features based on 3D invariant statistical moments is computed. Finally, a Bayesian classifier is employed to perform recognition over a small set of actions. Results are provided showing the effectiveness of the proposed algorithm in a SmartRoom scenario. Peer Reviewed. Postprint (published version

    Report of CE on Semantic DS

    ISO/IEC JTC1/SC29/WG11, MPEG00/M6355, 53rd meeting, Jul. 2000, Beijing, PR

    Quantum state-dependent diffusion and multiplicative noise: a microscopic approach

    The state-dependent diffusion, which concerns the Brownian motion of a particle in inhomogeneous media, has been described phenomenologically in a number of ways. Based on a system-reservoir nonlinear coupling model we present a microscopic approach to quantum state-dependent diffusion and multiplicative noise in terms of a quantum Markovian Langevin description and an associated Fokker-Planck equation in position space in the overdamped limit. We examine the thermodynamic consistency and explore the possibility of observing a quantum current, a generic quantum effect, as a consequence of this state-dependent diffusion similar to one proposed by Büttiker [Z. Phys. B 68, 161 (1987)] in a classical context several years ago. Comment: To be published in Journal of Statistical Physics, 28 pages, 3 figure